Using traits of web macro scripts to predict reuse
Authors
Abstract
To help people find code that they might want to reuse, repositories of end-user code typically sort scripts by number of downloads, ratings, or other information based on prior uses of the code. However, this information is unavailable when code is new or when it has not yet been reused. Addressing this problem requires identifying reusable code based solely on information that exists when a script is created. To provide such a model for web macro scripts, we identified script traits that might plausibly predict reuse, then used IBM CoScripter repository logs to statistically test how well each corresponded to actual reuse. These tests confirmed that the traits generally did correspond to higher levels of reuse as anticipated. We then developed a machine learning model that uses these traits as features to predict reuse of macros. Evaluating this model on repository logs showed that its accuracy is comparable to that of existing machine learning models for predicting reuse, but with a much simpler structure. Sensitivity analysis revealed that our model is quite robust; its quality is greatly reduced only when parameters are set to such extreme values that the model becomes inordinately selective. Testing the model with individual traits revealed those that provided the best predictions on their own. Based on these results, we outline opportunities for using our model to improve repositories of end-user code.
End users sometimes complete programming tasks by reusing existing code. Examples of reusable code include spreadsheets, educational simulations, and web macros, which are scripts that automate form-filling and navigation operations in a web browser [15]. Interviews of office workers have revealed the importance of reuse in web macro programming [14]. One motivation for reuse is that copying and customizing a script is sometimes faster than creating one from scratch. Another motivation is that an end user may not know how to perform a task, so a macro not only automates the task but also shows how to accomplish it.
Since not all end-user code is equally reusable, repositories typically provide a few limited features aimed at helping people find reusable code. For example, many repositories display download counters, ratings, and other popularity measures for each script as indicators of whether other people have successfully reused scripts. Repositories can also use these popularity measures to help sort search results, or as a form of "scoring system" to facilitate reuse. However, using these popularity measures …
Related resources
Evaluation of cultivated and wild barley cultivars affinities using micro and macro-morphological traits of grain, pollen, and stomata
Sayyedh Masomeh Hosseini, Mahlagh Ghorbanli and Hossein Sabouri
In ongoing research, 24 cultivated and wild cultivars of barley were evaluated for morphological characteristics of grain, pollen, and stomata. Traits of interest included length, width, and area. Analysis of variance showed that all samples differed in traits of stomata, grain, and pollen at the 1% and 5% probability levels, suggesting remarkable genetic variation among studied s...
Digging for diamonds: Identifying valuable end-user code in repositories
To a large extent, repositories of end-user code are “write-only”: much of the code that people publish never sees substantial reuse. Yet buried within these repositories are valuable pieces of code, though finding them is not always easy. In prior work, we developed a model that can predict, when a web macro is created, whether that script will be reused by anybody. In the current paper, we an...
TraitRecordJ: A programming language with traits and records
Traits have been designed as units for fine-grained reuse of behavior in the object-oriented paradigm. Records have been devised to complement traits for fine-grained reuse of state. In this paper, we present the language TraitRecordJ, a Java dialect with records and traits. Records and traits can be composed by explicit linguistic operations, allowing code manipulations to achieve fine-grained...
A Characterization Study on Memory Value Reuse
This paper presents a comprehensive characterization study on the exploitable memory value reuse present in programs. We compare three reuse schemes: store value reuse, loaded value reuse, and macro data reuse [12], [13]. Macro data reuse, enabled by macro data loads, capitalizes on under-utilized cache port bandwidth and makes use of the spatial locality found in port-wide macro data. Using a ...